Monitoring Partial Updates in Web Pages Using Relational Learning

Authors

  • Seiji Yamada
  • Yuki Nakai
Abstract

This paper describes an automatic monitoring system that constantly checks for partial updates in Web pages and notifies the user of them. One of the most important advantages of the WWW is that Web pages are updated frequently, but users must check the pages constantly to notice those updates, and this task can impose a considerable cognitive load. Unfortunately, applications that automatically check for such updates cannot deal with partial updates, such as an update to a particular cell of a table in a Web page. Hence we developed an automatic monitoring system that checks for such partial updates. A user gives the system, as training examples, the regions of a Web page in which he/she wants to be notified of updates, and the system learns rules for identifying the partial updates by relational learning. We implemented the system and present some execution examples.
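
To make the notion of a partial update concrete, the following is a minimal sketch of checking one table cell across two snapshots of a page. It is not the system described in the abstract: the rule that locates the region (a fixed row and column) is written by hand here, whereas the paper learns such rules from user-supplied training examples by relational learning. The names `CellExtractor`, `cell_text`, and `region_updated` are hypothetical, and the sketch assumes a page with a single, non-nested table.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, keyed by (row, column).

    Assumption: the page holds one non-nested table, so rows can simply
    be counted in document order.
    """

    def __init__(self):
        super().__init__()
        self.cells = {}        # (row, col) -> accumulated cell text
        self._row = -1
        self._col = -1
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row += 1
            self._col = -1
        elif tag in ("td", "th"):
            self._col += 1
            self._in_cell = True
            self.cells[(self._row, self._col)] = ""

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[(self._row, self._col)] += data


def cell_text(html, row, col):
    """Return the whitespace-normalized text of one cell, or None if absent."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    text = parser.cells.get((row, col))
    return None if text is None else " ".join(text.split())


def region_updated(old_html, new_html, row, col):
    """Hand-written stand-in for a learned rule: did this one cell change?"""
    return cell_text(old_html, row, col) != cell_text(new_html, row, col)
```

A monitor built along these lines would call `region_updated` on successive snapshots and notify the user only when the watched cell changes, ignoring edits elsewhere on the page.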

Similar resources

Analyzing new features of infected web content in detection of malicious web pages

Recent improvements in web standards and technologies enable attackers to hide and obfuscate infectious code with new methods and thus escape security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...

Web Page Classification Using Relational Learning Algorithm and Unlabeled Data

Applying relational tri-training (R-tri-training for short) to web page classification is investigated in this paper. R-tri-training, as a new relational semi-supervised learning algorithm, is well suited to learning in web page classification. The semi-supervised component of R-tri-training allows it to exploit unlabeled web pages to enhance the learning performance effectively. In addition,...

Shift it to the Server!-Let the Database Server Update Your Websites

From the beginnings of the World Wide Web, Web site administrators have used dynamically generated HTML pages to provide up-to-date information, e.g., online news, stock quotes, etc. Due to the high resource consumption of dynamic page generation approaches, many sites have switched over to periodical updates of frequently visited pages, e.g., a headline index of an electronic newspaper. Howeve...

Relational Learning: A Web-Page Classification Viewpoint

This paper organises some general observations on Relational Learning which arose from research into classifying Web pages. The motivation for this piece is to contribute towards developing a broad overview of the field, so as to understand which aspects of Relational Learning are common to all domains, and which aspects are peculiar to specific domains. Hence the views presented here are neces...

Web pages ranking algorithm based on reinforcement learning and user feedback

The main challenge of a search engine is ranking web documents to provide the best response to a user's query. Despite the huge number of the extracted results for user's query, only a small number of the first results are examined by users; therefore, the insertion of the related results in the first ranks is of great importance. In this paper, a ranking algorithm based on the reinforcement le...

Publication date: 2002